Megalodon also uses "chunk-wise attention," which divides the input sequence into fixed-size blocks, reducing the model's attention complexity from quadratic to linear in sequence length.
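To see why chunking makes attention linear, here is a minimal sketch (not Megalodon's actual implementation): each token attends only to tokens inside its own fixed-size block, so the cost is O(n · chunk_size) rather than O(n²). The function name and NumPy-based formulation are illustrative assumptions.

```python
import numpy as np

def chunkwise_attention(x, chunk_size):
    """Illustrative sketch: self-attention restricted to fixed-size chunks.

    Full attention over n tokens costs O(n^2); attending only within
    chunks of size c costs O(n * c), which is linear in n for fixed c.
    """
    n, d = x.shape
    out = np.zeros_like(x)
    for start in range(0, n, chunk_size):
        chunk = x[start:start + chunk_size]            # (c, d) block of tokens
        scores = chunk @ chunk.T / np.sqrt(d)          # scores only within the chunk
        # softmax over the chunk dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[start:start + chunk_size] = weights @ chunk
    return out

# Example: 1,000 tokens in chunks of 64 -> ~1000*64 score entries
# instead of 1000*1000 for full attention.
x = np.random.randn(1000, 16)
y = chunkwise_attention(x, chunk_size=64)
```

Real chunk-wise designs (Megalodon included) add mechanisms to pass information between chunks; this sketch shows only the complexity-reduction idea.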
Original Story At https://venturebeat.com/ai/meta-challenges-transformer-architecture-with-megalodon-llm/